A lot of the discussion around this issue has happened face-to-face, over internal channels, or has been fragmented across many different bugs. I am going to try to consolidate this into one post.

## The Underlying Issue

In the beginning, Firefox ran as a single process. Over time, we have made large efforts to parallelize Firefox so that it can distribute work across many different processes. Now, when Firefox opens a website, it sometimes needs to start a new process to do so. When we need to start a process, we invoke the binary at the location that our process was launched from.

These processes talk to each other using a custom IPC (interprocess communication) protocol. Because we often make changes to this protocol, processes from different versions of Firefox cannot properly communicate with each other.

This means that if Firefox has been updated while it is running, it will eventually open a new process and find that it is speaking an incompatible version of IPC. At this point, Firefox has no way of launching new processes that it is compatible with, so it is unable to load more websites until it is restarted. Once it is restarted, of course, all the processes are the same version and the problem goes away.

## Bug Activation Mechanisms

This problem can be caused in two known ways. If you are encountering this problem, and none of these situations seem to apply to you, please let us know!

1. While Firefox is running, another instance of Firefox using the same installation updates. This is typically the result of running multiple profiles simultaneously. This is also theoretically possible on a system with multiple simultaneous users but, to my knowledge, no one has ever reported this problem to us.
1. While Firefox is running, something else updates Firefox. Usually this is a Linux package manager, but occasionally we have seen Windows antivirus programs (try to) do this.
In theory, there are package managers for non-Linux systems that could probably cause this as well.

Note that Bug 1705217 tracks occurrences of this problem caused by package managers.

## Known Possible Solutions

### Don't have Firefox self-update if another instance is running.

This mechanism is described in some detail just above, in [Comment 7](https://bugzilla.mozilla.org/show_bug.cgi?id=1480452#c7). This could potentially solve the "updated by another instance of Firefox" bug, but not the "updated externally" bug.

This solution raises some concerns. It may enable malware to pretend to be a running instance of Firefox in order to attempt to prevent Firefox updates. Another concern is that users with many profiles may cycle through them and never give Firefox the opportunity to update. Given the increasing hostility of the internet and the fact that browsers are trusted to run potentially malicious code while still protecting users, we would prefer a solution that does a better job of keeping Firefox up-to-date.

### Fork Server

On some operating systems, like Linux, you can simply `fork()` a process to get a new one without needing to bother with the executable. We could keep a process around for the purpose of forking it into other processes when we need them.

The problem here is that we have many types of processes and they are all set up slightly differently. We would need the fork server to be an "undifferentiated" process that could be turned into any type of process needed. Then we would need to implement mechanisms to actually turn such a process into the necessary types.

This probably will not work on Windows, because Windows does not have a `fork()` call, as far as I know.

### Start Processes with File Descriptors

It may be possible to start the processes that we want by launching Firefox from its file descriptor rather than using its path. `fexecve`, for example, [does this](https://linux.die.net/man/3/fexecve).
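
To make the file-descriptor idea concrete, here is a minimal sketch in Python rather than Firefox's own C++. `spawn_from_fd` is a hypothetical helper, not anything in the Firefox tree. It assumes a platform where `os.execve` accepts a file descriptor (Python reports this through `os.supports_fd`), which in practice means a system that provides `fexecve`.

```python
import os
import sys


def spawn_from_fd(exe_fd, argv, env=None):
    """Fork, then exec the already-open executable `exe_fd` in the child.

    Hypothetical helper for illustration only. Returns the child's exit
    code, or -1 if the child did not exit normally.
    """
    if os.execve not in os.supports_fd:
        raise NotImplementedError("exec by file descriptor is unavailable here")
    env = dict(os.environ) if env is None else env

    pid = os.fork()
    if pid == 0:
        # Child: replace the process image using the descriptor, not a path.
        try:
            os.execve(exe_fd, argv, env)
        finally:
            # Only reached if execve() failed; never return into parent code.
            os._exit(127)

    # Parent: wait for the child and report how it exited.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    # Open the executable once, early, and keep the descriptor around.
    fd = os.open(sys.executable, os.O_RDONLY)
    try:
        print(spawn_from_fd(fd, [sys.executable, "-c", "print('child ran')"]))
    finally:
        os.close(fd)
```

The property of interest is that a descriptor opened at startup keeps referring to the file that was opened, even if the path is later replaced by a rename or unlink. That would not help if an updater overwrote the file in place.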

Unfortunately, the Firefox executable is not the only thing we would have to load this way. We would need `fdlopen` (which currently exists only on FreeBSD) to dynamically load libxul.

There may also be issues around code accessing the installation directory directly. It seems like we should be hitting those issues now, actually, so I'm a little confused about that. It's possible that this can cause a problem now and we've mostly just gotten lucky so far. But it is likely that all of [these consumers](https://searchfox.org/mozilla-central/search?q=NS_GRE_DIR&path=&case=false&regexp=false) would need to be fixed to open things with file descriptors in order for this solution to work properly.

### Mitigation via IPC versioning

It is theoretically possible to limit this problem to when there are breaking changes in the IPC protocol. It might even be possible to make IPC backwards compatible.

The first issue with this is that, from the conversations that I have had around this, breaking changes to the IPC protocol seem to happen all the time. Only very, very small security patches would likely be able to take advantage of this without a lot of effort being put into backwards compatibility.

Testing would also become a problem, since we cannot realistically test the interaction between any two versions of processes that might interact.

### Versioned Installations

We could potentially keep multiple versions of Firefox around at once. We would then have a launcher to pick the newest version when you go to start Firefox. Old versions would eventually be removed.

On Windows, this would be implemented with versioned directories (e.g., "C:\Program Files\Mozilla Firefox" might contain directories for "89.0", "89.1", and "90").

On macOS, this would be implemented with versioned frameworks. This would allow us to install multiple versions without breaking the signature on the app.

I haven't spent all that much time looking into this on Linux yet.
I suspect that versioned directories might not play nicely with package managers. We may want to have the Firefox package just be a launcher that depends on another package that has the browser. Then both packages could be installed simultaneously and updating might not be a problem? It could still be a problem, though, if the package manager uninstalls the old version. Obviously this needs further investigation if we want to go this route.

This would have the advantage of making it potentially possible to handle a new version crashing on startup by falling back to an older version. However, this also ends up being a disadvantage because it is effectively impossible to tell if the reason we crashed is our fault, or the fault of malware that _wants_ us to downgrade. Also, when falling back, profile downgrade prevention may kick in, causing the user to lose access to their profile until they can be successfully updated.

Because this mechanism requires a launcher, it is likely that implementing it without regressing startup performance would be _tough_. Windows already sort of has a launcher process, which might make this slightly easier. But changing the launcher will likely require careful optimization.

This implementation has some of the biggest advantages, but it would be an extremely large effort. The launcher process would take time to write, it would be very platform-specific, and it would probably be difficult or impossible to update. It would also need to forward command line options to Firefox; it's not immediately clear to me whether that would be easy or problematic. Migrating users to the new mechanism would require a lot of careful updater work. I don't even know exactly what is involved in shipping the entirety of Firefox as a framework on macOS, but I have been told that it wouldn't be easy. And the Linux story is still a big unknown.

## Where Do We Go From Here?

This is still an open question. We are continuing to search for other solutions to this problem and refine the ones that we have. I'm very reluctant to implement the first solution for the reasons already given.
And the other solutions are most likely more involved than the Update team can handle without lots of additional help. In other words, they are probably impractical without buy-in from the rest of the organization. Since, for better or for worse, this problem affects a relatively small group of people, this is a tough sell.
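
For reference, the "launcher picks the newest version" step from the Versioned Installations idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names are hypothetical, and it assumes versions are stored as subdirectories whose names are dot-separated integers (such as "89.0", "89.1", and "90"). It ignores the crash-fallback, downgrade-protection, and argument-forwarding concerns raised above.

```python
from pathlib import Path


def parse_version(name):
    """Turn "89.1" into (89, 1); return None for non-version names."""
    parts = name.split(".")
    if not all(p.isascii() and p.isdigit() for p in parts):
        return None
    return tuple(int(p) for p in parts)


def newest_version(names):
    """Return the name with the highest version, or None if none parse."""
    versioned = [(parse_version(n), n) for n in names]
    versioned = [(v, n) for v, n in versioned if v is not None]
    if not versioned:
        return None
    # Compare numerically, so "90" beats "89.1" and "10.0" beats "9.5".
    return max(versioned)[1]


def pick_install(install_dir):
    """Return the path of the newest versioned subdirectory, if any."""
    root = Path(install_dir)
    best = newest_version(p.name for p in root.iterdir() if p.is_dir())
    return root / best if best is not None else None
```

Comparing parsed integer tuples rather than raw strings is the one design choice that matters here, since a plain string comparison would rank "9.5" above "10.0".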
Bug 1480452 Comment 8 Edit History